The search functionality is under construction.

Author Search Result

[Author] Masanori HARIYAMA(29hit)

21-29hit(29hit)

  • An Asynchronous FPGA Based on LEDR/4-Phase-Dual-Rail Hybrid Architecture

    Shota ISHIHARA  Yoshiya KOMATSU  Masanori HARIYAMA  Michitaka KAMEYAMA  

     
    PAPER-Electronic Circuits

      Vol:
    E93-C No:8
      Page(s):
    1338-1348

    This paper presents an asynchronous FPGA that combines 4-phase dual-rail encoding and LEDR (Level-Encoded Dual-Rail) encoding. 4-phase dual-rail encoding is employed to achieve small area and low power for function units, while LEDR encoding is employed to achieve high throughput and low power for the data transfer using programmable interconnection resources. Area-efficient protocol converters and their control circuits are also proposed in transistor-level implementation. The proposed FPGA is designed using the e-Shuttle 65nm CMOS process. Compared to the 4-phase-dual-rail-based FPGA, the throughput is increased by 69% with almost the same transistor count. Compared to the LEDR-based FPGA, the transistor count is reduced by 47% with almost the same throughput. In terms of power consumption, the proposed FPGA achieves the lowest power compared to the 4-phase-dual-rail-based and the LEDR-based FPGAs. Compared to the synchronous FPGA, the proposed FPGA has lower power consumption when the workload is below 35%.

  • A Multi-Context FPGA Using Floating-Gate-MOS Functional Pass-Gates

    Masanori HARIYAMA  Sho OGATA  Michitaka KAMEYAMA  

     
    PAPER

      Vol:
    E89-C No:11
      Page(s):
    1655-1661

    Multi-context FPGAs (MC-FPGAs) have multiple memory bits per configuration bit forming configuration planes for fast switching between contexts. The additional memory planes cause a large overhead in area when a number of contexts are used. To overcome the overhead, a fine-grained MC-FPGA architecture using a floating-gate-MOS functional pass gate (FGFP) is presented which merges threshold operation and storage function on a single floating-gate MOS transistor. The test chip is designed using a 0.35 µm CMOS-EPROM technology. The transistor count of the proposed multi-context switch (MC-switch) is reduced to 13% in comparison with SRAM-based one. The total area of the proposed MC-FPGA is reduced to about 56% of that of a conventional SRAM-based MC-FPGA.

  • FPGA Implementation of a Stereo Matching Processor Based on Window-Parallel-and-Pixel-Parallel Architecture

    Masanori HARIYAMA  Yasuhiro KOBAYASHI  Haruka SASAKI  Michitaka KAMEYAMA  

     
    PAPER-VLSI Architecture

      Vol:
    E88-A No:12
      Page(s):
    3516-3522

    This paper presents a processor architecture for high-speed and reliable stereo matching based on adaptive window-size control of SAD (Sum of Absolute Differences) computation. To reduce its computational complexity, SADs are computed using images divided into non-overlapping regions, and the matching result is iteratively refined by reducing a window size. Window-parallel-and-pixel-parallel architecture is also proposed to achieve to fully exploit the potential parallelism of the algorithm. The architecture also reduces the complexity of an interconnection network between memory and functional units based on the regularity of reference pixels. The stereo matching processor is implemented on an FPGA. Its performance is 80 times higher than that of a microprocessor (Pentium4@2 GHz), and is enough to generate a 3-D depth image at the video rate of 33 MHz.

  • Evaluation of a Field-Programmable VLSI Based on an Asynchronous Bit-Serial Architecture

    Masanori HARIYAMA  Shota ISHIHARA  Michitaka KAMEYAMA  

     
    PAPER

      Vol:
    E91-C No:9
      Page(s):
    1419-1426

    This paper presents a novel asynchronous architecture of Field-programmable gate arrays (FPGAs) to reduce the power consumption. In the dynamic power consumption of the conventional FPGAs, the power consumed by the switch blocks and clock distribution is dominant since FPGAs have complex switch blocks and the large number of registers for high programmability. To reduce the power consumption of switch blocks and clock distribution, asynchronous bit-serial architecture is proposed. To ensure the correct operation independent of data-path lengths, we use the level-encoded dual-rail encoding and propose its area-efficient implementation. The proposed field-programmable VLSI is implemented in a 90 nm CMOS technology. The delay and the power consumption of the proposed FPVLSI are respectively 61% and 58% of those of 4-phase dual-rail encoding which is the most common encoding in delay insensitive encoding.

  • Architecture of an Asynchronous FPGA for Handshake-Component-Based Design

    Yoshiya KOMATSU  Masanori HARIYAMA  Michitaka KAMEYAMA  

     
    PAPER-Architecture

      Vol:
    E96-D No:8
      Page(s):
    1632-1644

    This paper presents a novel architecture of an asynchronous FPGA for handshake-component-based design. The handshake-component-based design is suitable for large-scale, complex asynchronous circuit because of its understandability. This paper proposes an area-efficient architecture of an FPGA that is suitable for handshake-component-based asynchronous circuit. Moreover, the Four-Phase Dual-Rail encoding is employed to construct circuits robust to delay variation because the data paths are programmable in FPGA. The FPGA based on the proposed architecture is implemented in a 65 nm process. Its evaluation results show that the proposed FPGA can implement handshake components efficiently.

  • Minimizing Energy Consumption Based on Dual-Supply-Voltage Assignment and Interconnection Simplification

    Masanori HARIYAMA  Shigeo YAMADERA  Michitaka KAMEYAMA  

     
    PAPER

      Vol:
    E89-C No:11
      Page(s):
    1551-1558

    This paper presents a design method to minimize energy of both functional units (FUs) and an interconnection network between FUs. To reduce complexity of the interconnection network, data transfers between FUs are classified according to FU types of operations in a data flow graph. The basic idea behind reducing the complexity of the interconnection network is that the interconnection resource can be shared among data transfers with the same FU type of a source node and the same FU type of a destination node. Moreover, an efficient method based on a genetic algorithm is presented.

  • Task Allocation with Algorithm Transformation for Reducing Data-Transfer Bottlenecks in Heterogeneous Multi-Core Processors: A Case Study of HOG Descriptor Computation

    Hasitha Muthumala WAIDYASOORIYA  Daisuke OKUMURA  Masanori HARIYAMA  Michitaka KAMEYAMA  

     
    PAPER-High-Level Synthesis and System-Level Design

      Vol:
    E93-A No:12
      Page(s):
    2570-2580

    Heterogeneous multi-core processors are attracted by the media processing applications due to their capability of drawing strengths of different cores to improve the overall performance. However, the data transfer bottlenecks and limitations in the task allocation due to the accelerator-incompatible operations prevents us from gaining full potential of the heterogeneous multi-core processors. This paper presents a task allocation method based on algorithm transformation to increase the freedom of task allocation. We use approximation methods such as CORDIC algorithms to map the accelerator-incompatible operations to accelerator cores. According to the experimental results using HOG descriptor computation, the proposed task allocation method reduces the data transfer time by more than 82% and the total processing time by more than 79% compared to the conventional task allocation method.

  • Low-Power Field-Programmable VLSI Using Multiple Supply Voltages

    Weisheng CHONG  Masanori HARIYAMA  Michitaka KAMEYAMA  

     
    PAPER-Low Power Methodology

      Vol:
    E88-A No:12
      Page(s):
    3298-3305

    A low-power field-programmable VLSI (FPVLSI) is presented to overcome the problem of large power consumption in field-programmable gate arrays (FPGAs). To reduce power consumption in routing networks, the FPVLSI consists of cells that are based on a bit-serial pipeline architecture which reduces routing block complexity. Moreover, a level-converter-less multiple-supply-voltage scheme using dynamic circuits is proposed, where the cells in non-critical paths use a low supply voltage for low power under a speed constraint. The FPVLSI is evaluated based on a 0.18-µm CMOS design rule. The power consumption of the FPVLSI using multiple supply voltages is reduced to 17% or less compared to that of the static-circuit-based FPVLSI using multiple supply voltages.

  • Evaluation of an FPGA-Based Heterogeneous Multicore Platform with SIMD/MIMD Custom Accelerators

    Yasuhiro TAKEI  Hasitha Muthumala WAIDYASOORIYA  Masanori HARIYAMA  Michitaka KAMEYAMA  

     
    PAPER-High-Level Synthesis and System-Level Design

      Vol:
    E96-A No:12
      Page(s):
    2576-2586

    Heterogeneous multi-core architectures with CPUs and accelerators attract many attentions since they can achieve power-efficient computing in various areas from low-power embedded processing to high-performance computing. Since the optimal architecture is different from application to application, finding the most suitable accelerator is very important. In this paper, we propose an FPGA-based heterogeneous multi-core platform with custom accelerators for power-efficient computing. Using the proposed platform, we evaluate several applications and accelerators to identify many key requirements of the applications and properties of the accelerators. Such an evaluation is very important to select and optimize the most suitable accelerator according to the requirements of an application to achieve the best performance.

21-29hit(29hit)